ABSTRACT

We study the spatial distribution of X-ray selected AGN in the framework of hierarchical co-evolution
of supermassive black holes (BHs) and their host galaxies and dark matter (DM)
haloes. To this end, we have applied the theoretical model developed by Croton et al. (2006),
De Lucia & Blaizot (2007) and Marulli et al. (2008) to the output of the Millennium Run and
obtained hundreds of realizations of past light-cones from which we have extracted realistic 
mock AGN catalogues that mimic the Chandra deep fields. We find that the model AGN 
number counts are in fair agreement with observations both in the soft and in the hard X-ray 
bands, except at fluxes $< 10^{-15}$ erg cm$^{-2}$ s$^{-1}$, where the model systematically overestimates
the observations. However, a large fraction of these faint objects is typically excluded from 
the spectroscopic AGN samples of the Chandra fields. We find that the spatial two-point 
correlation function predicted by the model is well described by a power-law relation out 
to 20 $h^{-1}$ Mpc, in close agreement with observations. Our model matches the correlation
length $r_0$ of AGN in the Chandra Deep Field North but underestimates it in the Chandra
Deep Field South. When fixing the slope to $\gamma = 1.4$, as in Gilli et al. (2005), the statistical
significance of the mismatch is 2-2.5$\sigma$, suggesting that the predicted cosmic variance, which
dominates the error budget, may not account for the different correlation length of the AGN in 
the two fields. However, the overall mismatch between the model and the observed correlation 
function decreases when both $r_0$ and $\gamma$ are allowed to vary, suggesting that more realistic AGN
models and a full account of all observational errors may significantly reduce the tension 
between AGN clustering in the two fields. While our results are robust to changes in the model 
prescriptions for the AGN lightcurves, the luminosity dependence of the clustering is sensitive 
to the different lightcurve models adopted. However, irrespective of the model considered, the 
luminosity dependence of the AGN clustering in our mock fields seems to be weaker than in 
the real Chandra fields. The significance of this mismatch needs to be confirmed using larger 
datasets. 

Key words: galaxies: active - galaxies: formation - cosmology: observations - cosmology: 
theory 



1 INTRODUCTION 

A cosmological co-evolution of DM structures, galaxies and
BHs is expected within the standard $\Lambda$CDM framework (see,
e.g., Volonteri et al. 2003; Cattaneo et al. 2005; Marulli et al. 2006;
Croton et al. 2006; Fontanot et al. 2006; Malbon et al. 2007;
Hopkins et al. 2008; Marulli et al. 2008, and references therein)
and is strongly supported by several lines of observational evidence,
such as the BH scaling relations and the luminosity functions
of galaxies and AGN (see, e.g., Magorrian et al. 1998;
Tremaine et al. 2002; Marconi & Hunt 2003; Ferrarese & Ford
2005; Hopkins et al. 2007; Graham 2008). Modelling these observations
is a significant challenge for modern computational astrophysics,
as it requires a self-consistent treatment of complex
physical processes acting both on very large scales, like the ones 
related to galaxy formation and evolution, and on very small scales, 
like the gas cooling and the mass accretion onto the central BHs. 

The computational cost of full cosmological hydrodynamical
simulations is very high, and only a few attempts have been
made thus far to directly follow the co-evolution of BHs and their
host galaxies within large cosmological volumes from high redshifts
to the present epoch (Li et al. 2007; Pelupessy et al. 2007;
Sijacki et al. 2007; Di Matteo et al. 2008). Moreover, every modification
of the prescriptions used to encapsulate the 'sub-grid'
physics requires the simulations to be repeated. A popular,
considerably less time-consuming alternative is to run high-resolution
cosmological simulations of the DM component alone and apply
semi-analytic prescriptions in post-processing to model the diffuse
galactic gas and its accretion onto the central BH. Using
this 'hybrid' approach, a galaxy formation model has been implemented
on top of the Millennium Run (Springel et al. 2005), a
very large simulation of the concordance $\Lambda$CDM cosmology, which
follows the DM evolution from $z = 127$ to the present, in a comoving
box of 500 $h^{-1}$ Mpc on a side and with a comoving spatial
resolution of 5 $h^{-1}$ kpc. The galaxy formation model was
originally proposed by Springel et al. (2001) and De Lucia et al.
(2004) and subsequently updated to include a 'radio mode' BH
feedback (Croton et al. 2006; De Lucia & Blaizot 2007) and to
self-consistently describe the BH mass accretion rate triggered by
galaxy merger events ('quasar mode') and its conversion into radiation
(Marulli et al. 2008, hereafter M08). The model outputs are
publicly available at the Millennium download site at the German
Astrophysical Virtual Observatory (Lemson & Virgo Consortium
2006).

Here, we use an updated version of the model as presented 
in M08. In several previous works the model has been extensively 
compared to a large set of observational data. Thanks to the 'radio 
mode' BH feedback, the model is able to reproduce the observed 
low mass drop-out rate in cooling flows, the exponential cut-off 
in the bright end of the galaxy luminosity function and the bulge-dominated
morphologies and old stellar ages of the most massive
galaxies in clusters (Croton et al. 2006). In fact, model predictions
are in agreement with several different properties of the galaxy and
BH populations (see, e.g., De Lucia et al. 2004, 2006; Springel et al.
2004; Wang et al. 2007; Croton & Farrar 2008; De Lucia & Helmi
2008, and references therein). In M08 the model predictions have
been compared to the observed scaling relations, fundamental plane
and mass function of BHs, and to the luminosity function of AGN.
The agreement between predicted and observed BH properties is
generally quite good. Also, the AGN luminosity function can be
well matched over the whole redshift range, provided it is assumed
that the cold gas fraction accreted by BHs at high redshifts is larger
than at low redshifts. Despite this success, some authors found discrepancies
between model predictions and some observations (see,
e.g., Weinmann et al. 2006; Kitzbichler & White 2007; Elbaz et al.
2007; McCracken et al. 2007; Gilli et al. 2007; Mateus 2008). This
suggests that several improvements in the physical assumptions
of the semi-analytic model are needed to make the model predictions
agree more closely with these observations. However, this is beyond
the scope of the present work, in which we focus on studying the
present model predictions about the BH and AGN populations, extending
the analysis of M08.

In this work, we focus on the AGN clustering, which represents
an additional, fundamental observational property that provides
further constraints to the theoretical models. Together with
the AGN luminosity function, the galaxy mass function and their
bias, the AGN clustering can be used to constrain the masses of the
AGN host galaxies, and thus the AGN lifetimes. In fact, if AGN are
long-lived sources, then they are probably rare phenomena occurring
in massive haloes, highly biased with respect to the underlying
mass distribution. On the contrary, if they are short-lived they likely
reside in typical haloes that are less clustered than the massive ones.

In recent years, wide-field surveys of optically selected AGN
have enabled tight measurements of the unobscured (type-1) AGN
clustering up to $z \sim 3$ (see, e.g., Porciani et al. 2004; Grazian et al.
2004; Croom et al. 2005; Porciani & Norberg 2003). The use of X-ray
selected AGN catalogues allows one to include also obscured
(type-2) objects, thus minimizing the impact of bolometric correc- 
tions. However, such observational studies have been limited by the 
lack of sizeable samples of optically identified X-ray sources. To
overcome this problem, Gilli et al. (2005) used the two deepest X-ray
fields to date, i.e. the 2 Msec Chandra Deep Field North (CDFN,
Alexander et al. 2003; Barger et al. 2003) and the 1 Msec Chandra
Deep Field South (CDFS, Rosati et al. 2002; Giacconi et al.
2002)$^{1}$. Limiting fluxes (in erg cm$^{-2}$ s$^{-1}$) of $\sim 2.5 \times 10^{-17}$ and $\sim
1.4 \times 10^{-16}$ for the CDFN and of $\sim 5.5 \times 10^{-17}$ and $\sim 4.5 \times 10^{-16}$
for the CDFS have been reached in the soft (0.5-2 keV) and hard
(2-10 keV) X-ray bands, respectively. A sample of 503 sources in
the CDFN and 346 sources in the CDFS has been collected over
two areas of 0.13 and 0.1 deg$^2$, respectively. The correlation properties
of the AGN in these two fields turned out to be quite different,
since the correlation length, $r_0$, measured in the CDFS is a factor
of $\sim 2$ higher than in the CDFN (Gilli et al. 2005). As it seems unlikely
that this difference can be due only to observational biases,
it has been argued that it could be accounted for if one includes the 
cosmic variance, supposedly large in these deep fields, in the error 
budget. 

To successfully discriminate between different AGN models 
one needs to account for all possible systematic errors that may 
plague the comparison between theoretical predictions and observations.
For this purpose, we construct a large set of mock AGN
catalogues that mimic as closely as possible the observed properties
of the X-ray selected AGN in the two Chandra fields and account
for all known observational biases. We then use these simulated
samples to 'observe' the number counts of mock AGN and their
clustering properties, which we then compare to observations. Thanks
to the large box of the Millennium Simulation, from which many such
independent samples can be extracted, we can directly assess
the impact of the cosmic variance by measuring the field-to-field 
variation of the mock AGN clustering properties. 

The paper is organized as follows. In Section 2 we briefly
discuss the main aspects of the hybrid simulation used to construct
the mock AGN catalogues. In Section 3 we describe the technique
used to extract realistic mock Chandra fields from the Millennium
Simulation. We compare the predicted AGN number counts and
spatial clustering with those measured in the Chandra Deep Fields
in Section 4. Finally, in Section 5 we summarize our conclusions
and discuss our results. 



2 THE AGN MODEL 

The hybrid simulation used in this paper is described in detail in
Croton et al. (2006) and De Lucia & Blaizot (2007). In the following,
we just give a brief description of the main features of the
model and review the new semi-analytic recipes recently included
by M08 to describe the AGN evolution.

1 The CDFS exposure has been recently extended to 2 Msec, and an updated
X-ray catalogue has already been released (Luo et al. 2008). In this
work, however, we will keep working with the 1 Msec X-ray source catalogue
of Giacconi et al. (2002), for which optical identification is almost
complete.

2.1 DM haloes and galaxies 

The model simulates the co-evolution of DM haloes, galaxies and
their central BHs in the $\Lambda$CDM 'concordance' cosmological framework,
with parameters $\Omega_{\rm m} = 0.25$, $\Omega_{\rm b} = 0.045$, $\Omega_\Lambda = 0.75$, $h =
H_0/100\,{\rm km\,s^{-1}\,Mpc^{-1}} = 0.73$, $n = 1$, and $\sigma_8 = 0.9$, consistent with
determinations from the combined analysis of the 2-degree Field
Galaxy Redshift Survey (2dFGRS; Colless et al. 2001) and first-year
WMAP data (Spergel et al. 2003), as shown by Sánchez et al.
(2006). The DM evolution is described through a numerical N-body
simulation, the Millennium Run, which followed the dynamics of
$2160^3 \simeq 10^{10}$ DM particles with mass $8.6 \times 10^{8}\,h^{-1}M_\odot$ in a periodic
box of 500 $h^{-1}$ Mpc on a side (Springel et al. 2005).
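
As a quick consistency check on the numbers quoted above, the particle mass follows from the box size, the particle number and the matter density parameter. The sketch below is illustrative only: the function name is hypothetical, and the critical density value ($2.775 \times 10^{11}\,h^2 M_\odot\,{\rm Mpc}^{-3}$) is the standard one, not a number taken from this paper.

```python
# Critical density today in units of h^2 M_sun / Mpc^3 (standard value,
# assumed here; it is not quoted in the text).
RHO_CRIT = 2.775e11


def particle_mass(omega_m, box_size, n_per_side):
    """Return the mass per particle in h^-1 M_sun.

    box_size is the comoving side length in h^-1 Mpc and n_per_side is
    the cube root of the total particle number.
    """
    total_mass = omega_m * RHO_CRIT * box_size ** 3
    return total_mass / n_per_side ** 3


# Millennium Run values quoted in the text.
m_p = particle_mass(omega_m=0.25, box_size=500.0, n_per_side=2160)
print(f"particle mass = {m_p:.3e} h^-1 M_sun")
```

The result agrees with the quoted $8.6 \times 10^{8}\,h^{-1}M_\odot$ to within rounding.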

The baryonic physics is implemented in a post-processing
phase, by exploiting the merger trees of DM haloes extracted from
the simulation. Two different techniques have been used to identify
DM haloes and their substructures: the friends-of-friends (FOF)
group-finder and an updated version of the SUBFIND algorithm
(Springel et al. 2001). To establish the connection between baryons
and DM haloes we assume that, when a DM halo collapses, a fixed mass
fraction of baryons collapses along with it, as proposed by White & Frenk (1991).
The baryon component, initially in the form of diffuse, pristine gas,
forms stars and changes its chemical composition. The evolution
of this diffuse gas is regulated by heating and cooling processes
described by using physically motivated prescriptions. The photo-ionization
heating of the intergalactic medium is invoked to suppress
the concentration of baryons in shallow potentials (Efstathiou
1992) and to make the accretion and cooling in low-mass haloes inefficient.
The star formation rate is assumed to be proportional to
the cold gas mass of the galaxy, while the supernova reheating of
the hot interstellar gas medium is proportional to the mass of stars.
If an excess of SN energy is present after reheating material to the
halo virial temperature, then an appropriate amount of gas leaves
the DM halo in the form of a 'super-wind'. Galaxy disk instability
is modelled using the analytic stability criterion of Mo et al. (1998).
DM substructures are followed until tidal truncation and stripping
disrupt them, or they fall below a mass of $1.7 \times 10^{10}\,h^{-1}M_\odot$. At this
point, a survival time is estimated using the subhalo's current orbit
and the dynamical friction formula of Binney & Tremaine (1987)
multiplied by a factor of 2, as in De Lucia & Blaizot (2007). After
this time, the galaxy is assumed to merge onto the central galaxy of
its own halo. The starburst triggered by galaxy mergers is modelled
with the prescriptions introduced by Somerville et al. (2001).

In Fig. 1 we show a typical merger tree in our model. The
sizes of brown and black dots are proportional to the stellar mass
of the galaxies and to the mass of the central BHs, respectively.
The red stars indicate the presence of an AGN and their sizes are
proportional to the AGN bolometric luminosities. In the example
shown, the merging history of a parent galaxy with stellar mass
$M_\star = 3.4 \times 10^{11}\,h^{-1}M_\odot$ is traced back in time from $z = 0$, at the
bottom of the plot, out to $z \sim 10$.



2.2 Supermassive black holes 

In order to populate our model galaxies with BHs and AGN, we 
adopt the following assumptions. The BH mass accretion is trig- 
gered by two different phenomena: i) the merger between gas-rich 
galaxies and ii) the cooling flow at the centres of X-ray emitting 
atmospheres in galaxy groups and clusters. 
Figure 1. A typical galaxy merger tree in our model. The sizes of brown and
black dots are proportional to the stellar mass of the galaxies and to the mass
of the central BHs, respectively. The red stars indicate the presence of an
AGN and their sizes are proportional to the AGN bolometric luminosities.
In the example shown, the merging history of a parent galaxy with stellar
mass $M_\star = 3.4 \times 10^{11}\,h^{-1}M_\odot$ is traced back in time from $z = 0$, at the bottom
of the plot, out to $z \sim 10$. The variable on the horizontal axis represents
the displacement between the parent galaxy and its progenitor, defined as
$\Delta x_{\rm gal} = \sum_{i=1}^{3}(x^{i}_{\rm gal} - x^{i}_{\rm par})$, where $x^{i}_{\rm gal}$ and $x^{i}_{\rm par}$ represent the three Cartesian,
comoving components of the progenitor and the parent galaxy, respectively,
in units of $h^{-1}$ Mpc.



The first kind of accretion, dubbed quasar mode, is closely
associated with starbursts. Many recent works seem to indicate
that major mergers do not constitute the only trigger of BH accretion
(see, e.g., Marulli et al. 2007; Kauffmann & Heckman 2008;
Hopkins & Hernquist 2008; Silverman et al. 2008, and references
therein). For this reason, we assume here that any galaxy merger
can trigger perturbations to the gas disk and drive gas onto the
galaxy centre. BHs can accrete mass both through coalescence with
another BH and by accreting cold gas, the latter being the dominant
accretion mechanism. The gas mass accreted during a merger
is assumed to be proportional to the total cold gas mass of the
galaxy (Kauffmann & Haehnelt 2000), but with an efficiency which
is lower for smaller mass systems and for unequal mergers:

$\Delta M_{\rm BH} = f_{\rm BH}\,\frac{m_{\rm sat}}{m_{\rm central}}\,\frac{m_{\rm cold}}{1 + V_{\rm vir,280}^{-2}}$ , (1)

where $m_{\rm sat}/m_{\rm central}$ is the total mass ratio of the merging galaxies, and $m_{\rm cold}$
and $V_{\rm vir,280}$ are the cold gas mass and the virial velocity (in units
of 280 km s$^{-1}$) of the central galaxy, respectively. The parameter
$f_{\rm BH} \approx 0.03$ is chosen to reproduce the observed local $M_{\rm BH}-M_{\rm bulge}$
relation (Croton et al. 2006). The accretion driven by major mergers
is the dominant mode of BH growth in this scenario. Its energy
feedback, which has not been included in the model so far, is approximated
by an enhanced effective feedback efficiency for the
supernovae associated with the starburst.
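
To make the scaling of the quasar-mode accretion concrete, the following minimal sketch evaluates the accreted mass under the assumption that equation (1) has the Croton et al. (2006) form, $\Delta M_{\rm BH} = f_{\rm BH}\,(m_{\rm sat}/m_{\rm central})\,m_{\rm cold}/(1 + V_{\rm vir,280}^{-2})$. The function and argument names are hypothetical and are not taken from the model code.

```python
def quasar_mode_accretion(m_cold, mass_ratio, v_vir_kms, f_bh=0.03):
    """Gas mass accreted by the central BH during a merger.

    Assumed form: f_bh * mass_ratio * m_cold / (1 + (280 km/s / V_vir)^2).
    m_cold     : cold gas mass of the central galaxy (any mass unit)
    mass_ratio : m_sat / m_central, between 0 and 1
    v_vir_kms  : virial velocity of the central galaxy in km/s
    """
    v280 = v_vir_kms / 280.0  # virial velocity in units of 280 km/s
    return f_bh * mass_ratio * m_cold / (1.0 + v280 ** -2)


# Equal-mass merger in a halo with V_vir = 280 km/s: half of f_bh * m_cold.
dm_major = quasar_mode_accretion(1e10, 1.0, 280.0)
# Same gas reservoir, but a 1:10 merger in a shallower potential.
dm_minor = quasar_mode_accretion(1e10, 0.1, 140.0)
print(dm_major, dm_minor)
```

Both trends stated in the text are visible: the accreted mass drops for unequal mergers (smaller mass ratio) and for lower-mass systems (smaller virial velocity).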

Once a static hot halo is formed around a galaxy, we assume 
that the radio mode sets in, in which a fraction of the hot gas quies- 
cently accretes onto the central BH. During this phase, the accretion 
rate is typically orders of magnitude below the Eddington limit, so
that the growth of the BH mass is negligible compared to that during the
quasar mode phase. However, the energy feedback associated with
it injects enough energy into the surrounding medium to reduce or 
even stop the cooling flow in the halo centres. In this scenario, the 
effectiveness of radio AGN in suppressing cooling flows is greatest 
at late times and for large values of the BH mass. 

The mass accretion onto the BHs and the associated bolometric
luminosity emitted can be described as follows:

$L_{\rm bol}(t) = f_{\rm Edd}(t)\,L_{\rm Edd}(t)$ , (2)

${\rm d}\ln M_{\rm BH}(t)/{\rm d}t = 1/t_{\rm ef}(t)$ , (3)

where $L_{\rm Edd}$ is the Eddington luminosity, $t_{\rm ef}(t) = \frac{\epsilon}{1-\epsilon}\,\frac{t_{\rm Edd}}{f_{\rm Edd}(t)}$ is the
e-folding time ($t_{\rm ef} \equiv t_{\rm Salpeter}$ if $f_{\rm Edd} = 1$), $\epsilon$ is the radiative efficiency,
$f_{\rm Edd}(t)$ is the Eddington factor and $t_{\rm Edd} = \sigma_{\rm T} c/(4\pi m_{\rm p} G) \simeq
0.45$ Gyr. As in M08, we do not follow the evolution of the BH
spins and we take a constant mean value for the radiative efficiency
of $\epsilon = 0.1$ at all redshifts.
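
The quoted value of the Eddington time can be verified directly from physical constants. The sketch below is illustrative: it assumes CODATA SI values for the constants and a Julian year for the unit conversion, and the function names are hypothetical.

```python
import math

# Physical constants in SI units (CODATA values, assumed here).
SIGMA_T = 6.6524587e-29   # Thomson cross-section [m^2]
C_LIGHT = 2.99792458e8    # speed of light [m/s]
M_PROTON = 1.67262192e-27 # proton mass [kg]
G_NEWTON = 6.67430e-11    # gravitational constant [m^3 kg^-1 s^-2]
GYR_IN_S = 3.15576e16     # one Gyr in seconds (Julian years)


def eddington_time_gyr():
    """t_Edd = sigma_T c / (4 pi m_p G), in Gyr."""
    t_sec = SIGMA_T * C_LIGHT / (4.0 * math.pi * M_PROTON * G_NEWTON)
    return t_sec / GYR_IN_S


def e_folding_time_gyr(epsilon=0.1, f_edd=1.0):
    """t_ef = epsilon / (1 - epsilon) * t_Edd / f_Edd, in Gyr."""
    return epsilon / (1.0 - epsilon) * eddington_time_gyr() / f_edd


print(f"t_Edd      = {eddington_time_gyr():.3f} Gyr")
print(f"t_Salpeter = {e_folding_time_gyr():.3f} Gyr (epsilon = 0.1, f_Edd = 1)")
```

With $\epsilon = 0.1$ and $f_{\rm Edd} = 1$ the e-folding (Salpeter) time is about one ninth of $t_{\rm Edd}$, i.e. roughly 0.05 Gyr; lowering $f_{\rm Edd}$ lengthens it proportionally.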

We consider three different prescriptions to model $f_{\rm Edd}$,
which determines the lightcurves associated with individual quasar
events:

• I: $f_{\rm Edd} = 1$, the simplest possible assumption.

• II: $f_{\rm Edd}$ is assumed to decrease at low $z$, as suggested by
Cattaneo & Bernardi (2003) and Shankar et al. (2004), to match the
BH mass function derived from a deconvolution of the AGN luminosity
function and the local BH mass function. Here, we adopt the
fit derived by Shankar et al. (2004):

• III: the evolution of an active BH is described as a two-stage
process of a rapid, Eddington-limited growth up to a peak BH
mass, preceded and followed by a much longer quiescent phase
with lower Eddington ratios. In this latter phase, the average time
spent by AGN per logarithmic luminosity interval can be approximated
as (Hopkins et al. 2005)

${\rm d}t/{\rm d}\ln L_{\rm bol} = |\alpha|\,t_9\,(L_{\rm bol}/10^{9} L_\odot)^{\alpha}$ , (5)

where $t_9$ is the total AGN lifetime above $10^{9} L_\odot$; $t_9 \sim 10^{9}$ yr over
the range $10^{9} L_\odot < L_{\rm bol} < L_{\rm peak}$, where $L_{\rm peak}$ is the AGN luminosity
at the peak of its activity. Over the range of $L_{\rm peak}$ they explored,
Hopkins et al. (2005) found that $\alpha$ is a function of only $L_{\rm peak}$, given
by $\alpha = -0.95 + 0.32 \log(L_{\rm peak}/10^{12} L_\odot)$, with $\alpha = -0.2$ (the approximate
slope of the Eddington-limited case) as an upper limit.
Here we interpret the Hopkins model as describing primarily the
decline phase of the AGN activity, after the BH has grown at the
Eddington rate to a peak mass $M_{\rm BH,peak} = M_{\rm BH}(t_{\rm in}) + \mathcal{F} \cdot \Delta M_{\rm BH,Q} \cdot
(1 - \epsilon)$, where $M_{\rm BH}(t_{\rm in})$ is the initial BH mass and $\Delta M_{\rm BH,Q}$ is the
fraction of cold gas mass accreted. We found that $\mathcal{F} = 0.7$ is the
value that best matches the AGN luminosity function (M08).
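From equation (5) we can derive:

$M_{\rm BH}(t) = M_{\rm BH,peak} + \frac{A}{BC}\left[(1 + Ct)^{B} - 1\right]$ , (6)

where $A = \frac{1-\epsilon}{\epsilon}\,\frac{L_{\rm peak}}{c^2}$, $B = \frac{1}{\alpha} + 1$ and $C = \frac{1}{t_9}\left(\frac{L_{\rm peak}}{10^{9} L_\odot}\right)^{-\alpha}$.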
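
The decline phase of lightcurve model III can be illustrated with a short sketch. It assumes that the Hopkins et al. (2005) relation has the form ${\rm d}t/{\rm d}\ln L = |\alpha|\,t_9\,(L/10^{9} L_\odot)^{\alpha}$, which integrates to $L(t) = L_{\rm peak}(1 + Ct)^{1/\alpha}$ with $C = (L_{\rm peak}/10^{9} L_\odot)^{-\alpha}/t_9$ when time is counted from the peak. The function names, and the choice of $t_9 = 10^{9}$ yr as default, are assumptions made for illustration.

```python
import math

L9 = 1.0e9  # reference luminosity in L_sun


def alpha_slope(l_peak):
    """Slope alpha(L_peak), capped at the upper limit of -0.2."""
    return min(-0.95 + 0.32 * math.log10(l_peak / 1.0e12), -0.2)


def luminosity_decline(t_yr, l_peak, t9_yr=1.0e9):
    """Bolometric luminosity (L_sun) a time t_yr after the peak.

    Obtained by integrating the assumed dt/dlnL relation from L_peak
    downwards; t9_yr is the total lifetime above 1e9 L_sun.
    """
    alpha = alpha_slope(l_peak)
    c = (l_peak / L9) ** (-alpha) / t9_yr
    return l_peak * (1.0 + c * t_yr) ** (1.0 / alpha)


for t in (0.0, 1.0e7, 1.0e8):
    print(f"t = {t:.1e} yr  ->  L = {luminosity_decline(t, 1.0e12):.3e} L_sun")
```

In this form the luminosity falls steeply right after the peak and then declines slowly, so most of the AGN lifetime is spent at low Eddington ratios, which is the behaviour the text attributes to model III.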
As shown in M08, the semi-analytic models described above
underestimate the number density of luminous AGN at high redshifts,
independently of the lightcurve model adopted. A significant
improvement can be obtained by simply assuming an accretion efficiency
that increases with redshift (Croton et al. 2006). In a parallel
work, Bonoli et al. (2008), we discuss a model in which the accretion
efficiency is linearly dependent on redshift. In the present work,
however, since our aim is to construct mock catalogues that best reproduce
the observed AGN population, we will use the model for
the accretion efficiency introduced in M08 to obtain a good match
to the AGN luminosity function:

$f_{\rm BH} = 0.01 \cdot \log\left(\frac{M_{\rm BH}}{10^{3} M_\odot} + 1\right) \cdot z$ for $z > 1.5$ and $M_{\rm BH} > 10^{6} M_\odot$,
$\Delta M_{\rm BH} = 0.01 \cdot m_{\rm cold}$ for $z > 6$. (7)

Here we keep the prescription III for the quasar lightcurves and,
for simplicity, we assume $M_{\rm BH,seed} = 10^{3} M_\odot$ for all seed BHs, irrespective
of their halo host properties and their origin. As in M08,
we will refer to this scenario as our best model. Note, however, that 
future improvements in the underlying physical assumptions may 
well lead to a yet better model in explaining the observations. 



3 SIMULATING THE CHANDRA DEEP FIELDS 

In order to directly compare our model predictions to the observed 
number counts and spatial clustering of the X-ray selected AGN in 
the CDFN and CDFS, we construct a suite of realistic mock AGN 
catalogues that mimic the selection effects of the real data. The aim 
is to account for all uncertainties stemming from the conversions 
between observed and intrinsic AGN properties and to estimate 
statistical errors. Systematic errors are accounted for by modelling
the selection effects of the AGN samples. Random errors contributed by
sparse sampling in the flux-limited catalogues and cosmic variance are
also taken into account by considering several independent mock
samples of AGN with number density comparable to that of the real
Chandra fields. Our realistic mock catalogues are obtained by constructing
backward light cones from the outputs of the Millennium
Simulation$^{3}$. To do this, we have to take into account that redshift
varies continuously, whereas the outputs of a simulation have been
stored at a finite set of redshifts. To interpolate between discrete
redshifts, we have used a technique similar to the standard approach
described in the literature (see, e.g., Croft et al. 2001; Blaizot et al.
2005; Roncarelli et al. 2006; Kitzbichler & White 2007), in which
the stacking of several computational boxes corresponding to different
redshift outputs is performed in comoving coordinates.

3 A light cone is a three-dimensional hypersurface, in space-time coordinates,
satisfying the condition that light emitted from every point is received
by an observer at $z = 0$. Its space-like projection is the volume of the sphere
defining the observer's current particle horizon. The observer's field of view
is the projection on the celestial sphere of a three-dimensional submanifold,
in space coordinates, located inside the observer's particle horizon.

Figure 2. Space-like projections of a past light cone and of a mock field of view with the same selection effects as the CDFS. Left-hand panel: all the AGN
predicted by the lightcurve model I (black dots) inside a virtual past light cone and the subpopulation of AGN in a mock CDFS (green dots). Right-hand panel:
zoom of the mock field of view represented by the green points in the left panel. The black dots show the AGN that meet the flux-selection criteria. Their
number counts and redshift distributions are displayed in Figs. 3 and 4, respectively. The blue and red dots show the type-1 and type-2 AGN in the mock
spectroscopic subsample specified in Section 4.2, used to compute the two-point correlation function. The size of the red and blue dots in the upper right panel scales as
the logarithm of the AGN observed flux.

To construct mock Chandra fields, we have considered the
spatial position and bolometric luminosity of the model AGN in
the Millennium Simulation, specified at the available output redshifts,
spaced in expansion factor according to $\log(1 + z_n) = n(n +
35)/4200$ (Springel et al. 2005). As a first step, we randomly locate
a virtual observer in the box at $z = 0$ and transform the coordinates
to have it at the centre. Then we construct its backward
light cone, which extends to $z = 5.72$, corresponding to a comoving
distance of $\sim 6000\,h^{-1}$ Mpc in our cosmological model, so one
would need to stack the simulation volume roughly 12 times. However,
we can take advantage of the much denser redshift sampling
of the output times (there are $\sim 45$ different outputs between $z = 0$
and $z = 5.72$) by adopting the following procedure. We divide the
light cone into slices along the line of sight based on the output 
times, so that each slice corresponds to one output and covers the 
redshift range closest to this output time. To avoid having replicas 
of the same cosmic structures along the line of sight, we exploit the
periodic boundary conditions and adopt the same scrambling technique
used by Roncarelli et al. (2006). All CDFs were extracted
from different light cones. The procedure is repeated 100 times,
for each of the 4 lightcurve models considered and for the CDFS
and CDFN separately (totalling 400+400 mock CDF samples).
To perform the analysis described in Section 4.2, it is important
to estimate how many of these samples are statistically independent.
This can be done by comparing the volume of each sample
to that of the Millennium Simulation box, taking into account that
the very rare AGN with $z > 2$ do not affect the clustering properties
of the sample and can be safely excluded from the spatial correlation
analysis, as we have checked. It turns out that, for each lightcurve
model, all the 100+100 CDFs extracted from the Millennium box
are independent and will be treated as such in the rest of this work.
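
The light-cone geometry quoted above (about 45 outputs out to $z = 5.72$, a comoving depth of about $6000\,h^{-1}$ Mpc, and roughly 12 box lengths) can be reproduced with a short calculation. This is a sketch under stated assumptions: a flat cosmology with $\Omega_{\rm m} = 0.25$ and $\Omega_\Lambda = 0.75$, the output spacing $\log(1 + z_n) = n(n + 35)/4200$ applied to snapshot indices 0 to 63, and Simpson's rule for the distance integral. Function names are hypothetical.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc
BOX_SIZE = 500.0        # Millennium box side in h^-1 Mpc


def snapshot_redshift(n):
    """Output redshift for snapshot index n, from log10(1+z_n) = n(n+35)/4200."""
    return 10.0 ** (n * (n + 35) / 4200.0) - 1.0


def comoving_distance(z, omega_m=0.25, omega_l=0.75, steps=2000):
    """Line-of-sight comoving distance in h^-1 Mpc (flat LCDM, Simpson's rule)."""
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_l)

    h = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return C_OVER_H0 * total * h / 3.0


z_max = snapshot_redshift(44)  # the output closest to z = 5.72
n_outputs = sum(1 for n in range(64) if snapshot_redshift(n) <= z_max)
d_max = comoving_distance(z_max)
n_boxes = d_max / BOX_SIZE

print(f"z_max = {z_max:.2f}, outputs = {n_outputs}")
print(f"comoving depth = {d_max:.0f} h^-1 Mpc = {n_boxes:.1f} box lengths")
```

Since the number of available outputs is several times the number of box lengths along the line of sight, each slice of the cone can be drawn from its own output, which is what makes the slicing procedure described in the text possible.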
Mock Chandra fields are obtained by mimicking the selection ef- 
fects of the real samples. To do this, we identify all AGN with the 
BHs in the quasar phase and discard those too faint to meet the 
flux-selection criteria. The latter are based on the flux measured 
in the soft and hard X-ray bands, while our models predict bolometric
luminosity. To convert intrinsic bolometric luminosities into
soft- and hard-band X-ray luminosities, we use the bolometric correction
proposed by Hopkins et al. (2006), which assumes that the average
AGN X-ray spectrum beyond 0.5 keV can be approximated by a
power law with an intrinsic photon index $\Gamma = 1.8$. To transform the
intrinsic flux into the observed one, we need to account for photon
absorption along the line of sight. To do that, we impose that
the intrinsic hydrogen column densities, $N_{\rm H}$, of our AGN are distributed
according to La Franca et al. (2005), and that the Galactic
$N_{\rm H}$ towards the CDFN and CDFS is $(1.3 \pm 0.4) \times 10^{20}$ cm$^{-2}$ and
$(8.8 \pm 0.4) \times 10^{19}$ cm$^{-2}$, respectively. We have checked that using
the $N_{\rm H}$ distribution proposed by Gilli et al. (2007) has a negligible
effect on the final results. Only AGN with observed fluxes
above the limit $F_{\rm limit}$ of the CDFN and CDFS are included in our
mock catalogues. The value of $F_{\rm limit}$ in the CDFN and CDFS varies
across the field of view. We account for this effect by adopting the
dependence of $F_{\rm limit}$ on the angular distance from the fields' centre
given by Giacconi et al. (2002) and Bauer et al. (2004).

We have subdivided all mock AGN into type-1 and type-2,
according to their $N_{\rm H}$ absorption. AGN with $N_{\rm H} < 10^{22}$ cm$^{-2}$ are
classified as type-1, the more absorbed ones as type-2. This
classification corresponds fairly well to the optical separation into 
broad-line and narrow-line AGN. All mock CDFN and CDFS pairs 
are extracted at large angular separation to guarantee independent 
spatial correlation properties. 
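
The last two selection steps, the flux cut and the split into type-1 and type-2 objects, can be sketched as follows. This is a simplified illustration, not the pipeline used in the paper: it assumes a single flux limit for the whole field (whereas the text adopts a limit that varies with off-axis angle), and the record layout and function names are hypothetical.

```python
TYPE1_MAX_LOG_NH = 22.0  # log10 of the N_H threshold in cm^-2, from the text


def classify_agn(log_nh):
    """Return 'type-1' for N_H < 1e22 cm^-2 and 'type-2' otherwise."""
    return "type-1" if log_nh < TYPE1_MAX_LOG_NH else "type-2"


def select_mock_field(sources, flux_limit):
    """Keep sources brighter than flux_limit and attach their AGN type.

    Each source is a dict with an observed 'flux' (erg cm^-2 s^-1) and
    an intrinsic column density 'log_nh' (log10 of N_H in cm^-2).
    A constant flux_limit is a simplifying assumption.
    """
    selected = []
    for source in sources:
        if source["flux"] > flux_limit:
            selected.append({**source, "type": classify_agn(source["log_nh"])})
    return selected


# Three made-up sources, for illustration only.
mock_sources = [
    {"flux": 3.0e-16, "log_nh": 20.5},
    {"flux": 1.0e-17, "log_nh": 21.0},  # too faint, dropped
    {"flux": 8.0e-16, "log_nh": 23.1},
]
field = select_mock_field(mock_sources, flux_limit=5.5e-17)
print([s["type"] for s in field])
```

A position-dependent limit could be handled by passing a function of the off-axis angle instead of a constant, at the cost of storing each source's angular position.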

The left panels in Fig. 2 show the three space-like projections
of a simulated past light cone and of a mock field of view with the
same selection effects as the CDFS. The small, black dots represent
all model AGN within the cone predicted by the lightcurve model
I. The larger, green dots indicate all AGN within a mock CDFS,
placed at the centre of the box. The panels on the right zoom in on the
mock CDFS. In this case, however, the black dots show the AGN
that meet the flux-selection criteria specified above. The larger blue
and red dots show the type-1 and type-2 AGN in the mock spectroscopic
subsample defined in Section 4.2 that will be used to compute
the two-point correlation function. The size of the red and blue
dots in the upper right panel scales as the logarithm of the AGN observed
flux.

Figure 3. The predicted AGN number counts in the mock CDFs compared to those determined by Bauer et al. (2004), as a function of log (S [erg cm$^{-2}$ s$^{-1}$]). The left-hand and right-hand
panels display the number counts of the AGN selected in the soft and hard X-ray bands, respectively. The dark and light grey shaded areas show the observed
AGN counts obtained with two different classification schemes used to separate AGN from star-forming galaxies. Model predictions: the dashed black curves
represent the median of all 100 CDF mocks and the bands indicate the 5th and 95th percentiles. Different colours characterize the different lightcurve models
described in Section 2.2, as indicated by the labels.



4 MODEL VS. OBSERVATIONS 

In this Section, we compare the AGN number counts and spatial 
clustering predicted by our model with the ones measured in the 
CDFs. We quantify the dependence of our predictions on the AGN 
obscuration and on the X-ray selection band. We estimate the ef- 
fect of the cosmic variance in these deep fields and investigate how 
robust our conclusions are with respect to the prescription adopted 
for the AGN lightcurves of individual accretion events. In order to 
directly compare our predictions to observations, we use the mock 
AGN catalogues constructed with the technique described in the 
previous Section. 

4.1 AGN number counts 

Fig. 3 shows the comparison between the AGN number counts,
N(>S), predicted by our model and the ones measured in the CDFs
by Bauer et al. (2004), where N is the number of AGN per unit sky
area and S is their observed flux. The left-hand and right-hand panels
display the number counts of the AGN selected in the soft and
hard X-ray band, respectively. The dark and light grey shaded areas
show the observed AGN counts obtained with two different classification
schemes used to separate AGN from star-forming galaxies,
one which conservatively estimates the number of AGN and the
other which conservatively estimates the number of star-forming
X-ray sources (see Bauer et al. 2004 for details). The dashed black
curves represent the median number counts computed over all 100
mock Chandra fields. The surrounding bands indicate the 5th and
95th percentiles. Different colours are used to characterize the
predictions of the different lightcurve models considered in Section
2.2. As indicated by the labels, the model predictions are separately
compared both with the whole AGN population and with the type-1
and type-2 ones. The width of the coloured areas is a measure
of the predicted cosmic variance. As shown in Fig. 3, in the flux
range covered by the available observed AGN luminosity functions
we recover the same results discussed in M08. In particular, if we
assume that AGN always shine at the Eddington luminosity (model
I, blue), the predicted AGN number density is on average too low
in the flux range $\sim 10^{-15}$-$10^{-14}$ erg cm$^{-2}$ s$^{-1}$, especially that of
the type-2 population. Assuming a lower Eddington ratio at low
redshifts, as in our model II (red), or a decline phase of the AGN
activity after an Eddington accretion phase up to a peak mass, as in
our models III (green) and best (cyan), partly alleviates the problem.
However, at $S < 10^{-15}$ erg cm$^{-2}$ s$^{-1}$ in the soft band, i.e. in
a flux range accessible only in the X-ray selected deep fields, our
model systematically overestimates the AGN number density,
irrespective of the AGN lightcurve model, a mismatch that increases
as AGN fluxes and Eddington factor decrease.
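
As an illustration of how a median curve and a 5th-95th percentile band of N(>S) can be built from a set of mock fields, the following Python sketch computes the cumulative counts of each mock on a common flux grid and then takes the percentiles over the mocks at each flux. It is only a sketch under stated assumptions: the function names, the flux grid, the field area and the linear-interpolation percentile are hypothetical choices made for this example, not the code used to produce the figures.

```python
from statistics import median


def cumulative_counts(fluxes, flux_grid, area_deg2):
    """N(>S) per unit sky area, evaluated at each flux S in flux_grid."""
    return [sum(1 for f in fluxes if f > s) / area_deg2 for s in flux_grid]


def percentile(values, q):
    """Percentile q (0-100) of a list, with linear interpolation (an assumption)."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def counts_band(mock_fluxes, flux_grid, area_deg2):
    """Return (5th percentile, median, 95th percentile) of N(>S) over all mocks,
    one tuple per flux in flux_grid. mock_fluxes is a list of flux lists."""
    per_mock = [cumulative_counts(f, flux_grid, area_deg2) for f in mock_fluxes]
    band = []
    for i in range(len(flux_grid)):
        column = [counts[i] for counts in per_mock]
        band.append((percentile(column, 5), median(column), percentile(column, 95)))
    return band
```

The spread between the 5th and 95th percentiles at a given flux is then a direct, if rough, measure of the field-to-field scatter among the mocks.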

To further investigate this point, in Fig. 4 we show the redshift
distribution of the AGN in our mock catalogues as a function of the
lightcurve model, as indicated by the labels. Each model histogram
has been obtained by averaging over 100 mock catalogues.
Uncertainties in the model predictions are computed by assuming Poisson
statistics. The grey shaded histograms show the redshift distribution
measured in the CDFS by Zheng et al. (2004), who used
the photometric redshifts of 342 X-ray sources, which constitute
99% of all the detected X-ray sources in the field. The solid black
lines show the AGN redshift distributions derived by integrating the
bolometric luminosity function of Hopkins et al. (2007). They can
be considered as upper limits, since this computation does not
account for the sky coverage of the fields, assuming instead a constant
flux limit for all the AGN. As can be seen in the figure, the faint
AGN population, overestimated by the model as shown in Fig. 3,
is distributed at all redshifts larger than unity. The mismatch is
particularly evident in the soft X-ray selected samples.

Figure 4. The redshift distribution of the AGN in our mock catalogues
(coloured histograms). Uncertainties in the model predictions are computed
by assuming Poisson statistics. Grey shaded histograms show the AGN
distribution computed in the CDFS by Zheng et al. (2004). The solid black
lines display the AGN number counts derived by integrating the bolometric
luminosity function of Hopkins et al. (2007).

As we have checked, the number density of AGN with fluxes
$> 10^{-15}$ erg cm$^{-2}$ s$^{-1}$ predicted by all models (apart from model I)
is similar to, or slightly smaller than, the observed one. On the
contrary, all models over-predict the number density of fainter AGN
that, however, are typically excluded from the mock CDFs. This
discrepancy can be due to one or more of the following reasons: at
$S < 10^{-15}$ erg cm$^{-2}$ s$^{-1}$, i) the mechanism triggering the BH mass
accretion is less efficient than we have assumed, ii) the accretion
time is overestimated, iii) the model fraction of obscured AGN is
underestimated. Clearly, the model needs to be further developed
along these lines to match observations. However, for the purpose
of studying the AGN clustering in the CDFs, the over-abundance
of faint AGN in our model does not necessarily represent a problem,
since almost all of them are excluded from the spectroscopic
AGN samples of the CDFs (see below).

4.2 AGN spatial clustering 

We compare the spatial clustering of AGN in our mock CDFs
with that measured in the real catalogues by Gilli et al. (2005)
and investigate the dependence on the AGN luminosity. We
quantify the AGN clustering properties by means of the two-point
auto-correlation function in real space, $\xi(r)$, using the
Landy & Szalay (1993) estimator

$$\xi(r) = \frac{AA(r) - 2RA(r) + RR(r)}{RR(r)},$$

where AA(r), RA(r) and RR(r) are the fractions of mock AGN-AGN,
AGN-random and random-random pairs, with spatial separation,
r, in the range $[r - \delta r/2, r + \delta r/2]$. The random sample
is obtained by randomly positioning objects within the same light
cones and according to the selection criteria of the AGN sample.
The rationale behind computing $\xi(r)$ using spatial positions rather
than redshifts is that we wish to compare model predictions with the
estimates of Gilli et al. (2005) and Plionis et al. (2008), in which
redshift distortions have been corrected for either by projecting the
redshift space correlation function or by inverting the measured
angular correlation function via Limber's equation.
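
A minimal Python sketch of the Landy & Szalay (1993) estimator, $\xi = (AA - 2RA + RR)/RR$ with normalised pair counts, may help to make the procedure concrete. It assumes brute-force pair counting over small samples of three-dimensional comoving positions; the function names and the binning interface are hypothetical, and a real analysis of 100 mock fields would use a faster pair counter.

```python
import math


def pair_fractions(a, b, edges, auto):
    """Fraction of pairs falling in each separation bin [edges[k], edges[k+1]).

    auto=True counts each unordered pair of sample `a` once (b is ignored);
    auto=False counts all cross pairs between `a` and `b`.
    """
    counts = [0] * (len(edges) - 1)
    for i, p in enumerate(a):
        others = a[i + 1:] if auto else b
        for q in others:
            r = math.dist(p, q)
            for k in range(len(counts)):
                if edges[k] <= r < edges[k + 1]:
                    counts[k] += 1
                    break
    total = len(a) * (len(a) - 1) / 2 if auto else len(a) * len(b)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def landy_szalay(agn, randoms, edges):
    """xi(r) per bin; bins without random-random pairs are returned as NaN."""
    aa = pair_fractions(agn, agn, edges, auto=True)
    ra = pair_fractions(agn, randoms, edges, auto=False)
    rr = pair_fractions(randoms, randoms, edges, auto=True)
    return [(x - 2 * y + z) / z if z > 0 else float("nan")
            for x, y, z in zip(aa, ra, rr)]
```

In practice the random sample should be much larger than the AGN sample, so that the shot noise in RA and RR does not dominate the error on $\xi(r)$.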

To test whether our model is able to match the two-point correlation
functions in the CDFs measured by Gilli et al. (2005), we
have extracted mock AGN catalogues that closely mimic the
spectroscopic AGN samples, in which only objects with good optical
spectra, i.e. with spectral quality flag $Q \geq 2$, are considered. For
the majority of the AGN in the CDFs, the latter condition is verified
when $M_R < 25$, where $M_R$ is the total apparent magnitude in
the R band, i.e. including the contribution of both the AGN and its
host galaxy.

To extract a mock spectroscopic subsample, we have computed
the R band magnitude of all AGN in the mock Chandra Deep
Fields and rejected all objects with $M_R > 25$. In addition, since only
about half of the AGN redshifts in the Chandra Deep Fields have
been measured, we randomly diluted our sample, keeping only 50%
of the mock sources. In Appendix A we describe the procedure
adopted to convert the intrinsic bolometric luminosities of model
AGN into apparent R magnitudes, given the redshift of the object
and its column density $N_{\rm H}$. The observer frame R magnitudes of the
host galaxies have been obtained assuming the parametrization for
dust attenuation proposed by De Lucia & Blaizot (2007). We note
that the redshift distribution of the mock samples obtained with this
procedure is remarkably similar to those observed for the
spectroscopic samples of CDFs (e.g. Szokoly et al. 2004, Barger et al.
2003).
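
The two selection steps described above (a magnitude cut at 25 followed by a random 50% dilution) can be sketched in Python as follows. The catalogue layout (a list of dictionaries with an `m_R` key), the function name and the use of a fixed random seed are assumptions made for this example only.

```python
import random


def spectroscopic_subsample(catalogue, mag_limit=25.0, completeness=0.5, seed=0):
    """Mimic a spectroscopic sample: reject objects fainter than mag_limit in the
    total (AGN + host) apparent R magnitude, then keep a random fraction
    `completeness` of the survivors. Each object is a dict with key "m_R"
    (hypothetical layout)."""
    rng = random.Random(seed)  # fixed seed so that the example is reproducible
    bright = [obj for obj in catalogue if obj["m_R"] <= mag_limit]
    n_keep = round(completeness * len(bright))
    return rng.sample(bright, n_keep)
```

Repeating the dilution with different seeds gives a simple handle on the sparse-sampling contribution to the scatter among mock samples.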

The grey shaded areas in the four panels of Fig. 5 represent the
power-law model two-point correlation functions that, according to
Gilli et al. (2005), best fit the correlation properties of the AGN in
the CDFs. We show the case in which the authors fixed the slope to
$\gamma = 1.4$ in order to focus on the difference in the $r_0$ value between
the two AGN populations, given the large errors introduced by low
number statistics. The latter are modeled as simple Poisson errors.
We have repeated the same best fitting procedure to the two-point
correlation function measured in each of the mock CDFs. The result
is represented by the bands of different colours. Their width
represents the field to field variance and accounts for both sparse
sampling and cosmic variance. Therefore, these errors quantify the
discrepancy between the $r_0$ in the data and the models, under the
rather strong assumption that $\gamma = 1.4$.

The yellow dots represent the two-point correlation functions
computed using all the AGN pairs in all mock fields. The fact that
they are located within the coloured areas indicates the adequacy
of the power-law model adopted for the best fit. As in Fig. 3, we
show our predictions for the whole AGN population and separately
for the type-1 and type-2 AGN.

The parameters of the best fits are listed in Table 1, together
with the errors in the form $r_0 \pm \sigma_{r_0}\ (\langle {\rm err}(r_0) \rangle)$, where $r_0$ is the best
fit value, $\sigma_{r_0}$ represents the field-to-field rms and $\langle {\rm err}(r_0) \rangle$ is the
Poisson uncertainty on $r_0$ averaged over all mock fields. When
comparing the errors in the mocks, which account for both sparse
sampling and cosmic variance, with the Poisson errors of Gilli et al.
(2005), we see that the error budget is dominated by cosmic variance.
In the CDFN, the correlation length of the mock AGN is consistent
with the data. In all models the mean $r_0$ value is smaller
than the observed one. However, the difference is below $1\sigma$.
Interestingly, our model predictions for the $r_0$ values are in good
agreement with the one estimated by considering all extragalactic
objects with measured redshifts in the CDFN, including galaxies
($r_0 = 4.2 \pm 0.4\ h^{-1}$ Mpc; see Table 2 of Gilli et al. 2005). Since
galaxies make up $\sim 30\%$ of the spectroscopic sample, this fact
could be explained by assuming that most of these galaxies actually
contain a weak AGN outshone by their host.

Figure 5. The four panels show the spatial two-point correlation function measured in 100 mock Chandra fields, as a function of the different lightcurve
models adopted. The grey shaded areas have been computed using the best-fit power-law model with fixed slope $\gamma = 1.4$ fitted by Gilli et al. (2005) to the
CDF AGN real space correlation function. As indicated by the labels, the model predictions are compared both with the whole AGN population and with the
type-1 and type-2 ones, separately. The yellow dots represent the correlation function of all AGN in the 100 mock Chandra fields. The coloured areas bracket
the 5th and 95th percentile of the best-fit power-law to the correlation function in each mock sample (see Table 1). The bandwidth accounts for the different
sources of uncertainties, including cosmic variance. The fact that yellow dots are found within the shaded region indicates the adequacy of the best fit model.



$r_0$ [$h^{-1}$ Mpc]: CDFN

                  all AGN             type 1              type 2
real catalogue    5.1 (…)             5.6 (…)             4.7 (…)
mock I            3.5 ± 1.7 (0.8)     4.7 ± 4.0 (1.6)     3.7 ± 3.0 (1.5)
mock II           5.4 ± 2.0 (0.08)    5.2 ± 2.6 (0.2)     5.3 ± 1.7 (0.1)
mock III          3.9 ± 1.3 (0.2)     4.3 ± 1.7 (0.5)     5.1 ± 2.1 (0.4)
mock best         4.1 ± 1.3 (0.2)     4.7 ± 1.9 (0.4)     4.8 ± 1.7 (0.4)

$r_0$ [$h^{-1}$ Mpc]: CDFS

                  all AGN             type 1              type 2
real catalogue    10.4 (0.8)          10.1 (…)            10.7 (…)
mock I            3.7 ± 3.2 (1.8)     18 ± 5.6 (2.7)      6.2 ± 4.3 (2.7)
mock II           4.7 ± 1.6 (0.2)     4.9 ± 2.0 (0.4)     6.5 ± 2.9 (0.4)
mock III          4.1 ± 2.0 (0.6)     5.2 ± 3.6 (1.2)     4.8 ± 3.1 (1.5)
mock best         4.2 ± 1.7 (0.5)     5.1 ± 2.6 (1.0)     4.7 ± 3.1 (1.5)

Table 1. The best fit parameters: $r_0 \pm \sigma_{r_0}\ (\langle {\rm err}(r_0) \rangle)$, where $\xi(r) = (r/r_0)^{-\gamma}$, $\sigma_{r_0}$ are the field-to-field variances of $r_0$; $\langle {\rm err}(r_0) \rangle$ are the parameter uncertainties
averaged over the mock fields.

We also performed a two-parameter fit as in Gilli et al.
(2005). In this case, however, the fitting procedure is not robust.
Different fitting methods provide different results and the scatter
among the best fitting values of $r_0$ and $\gamma$ is comparable to, and
sometimes larger than, their formal error. The effect is larger for model
I, which predicts significantly fewer AGN in the CDFs than the other
models. Yet, in all models explored a power-law model provides a
good fit to the measured $\xi(r)$ which, for the CDFN, is fully
consistent with the data. For example, for the model dubbed "best" we
have obtained $r_0 = 3.8 \pm 0.8$ and $\gamma = 1.5 \pm 0.3$ in the CDFN and
$r_0 = 3.6 \pm 0.7$ and $\gamma = 1.5 \pm 0.4$ in the CDFS, where the quoted
errors represent the scatter among the mocks. These values can
be compared with the measured values $r_0 = 5.5 \pm 0.6$ and $\gamma =
1.50 \pm 0.12$ in the CDFN and $r_0 = 10.3 \pm 1.7$ and $\gamma = 1.33 \pm 0.14$
in the CDFS. A two-parameter fit reduces the differences between
the AGN clustering in the CDFN and CDFS. However, the lack of
robustness in the two-parameter fitting procedure and the covariance
between $r_0$ and $\gamma$ hamper a quantitative estimate. We can only
conclude that the discrepancy between the model and the observed
two-point correlation functions measured in the CDFS is smaller
than the 2-2.5$\sigma$ difference in the correlation lengths $r_0$.
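
To make the difference between the fixed-slope and the two-parameter fits explicit, the following Python sketch fits the power law $\xi(r) = (r/r_0)^{-\gamma}$ by unweighted least squares in log-log space. This is an assumption made for illustration: the fits discussed in the text weight the data by their errors, and the function names are hypothetical. Both functions require $\xi > 0$ in every bin.

```python
import math


def fit_r0_fixed_slope(r, xi, gamma=1.4):
    """Best-fit r0 for xi = (r/r0)^-gamma with gamma held fixed.

    Minimising sum (ln xi + gamma ln r - gamma ln r0)^2 over ln r0 gives
    ln r0 = mean(ln r) + mean(ln xi) / gamma.
    """
    x = [math.log(v) for v in r]
    y = [math.log(v) for v in xi]
    n = len(x)
    return math.exp(sum(x) / n + sum(y) / (n * gamma))


def fit_r0_gamma(r, xi):
    """Two-parameter fit: straight line ln xi = -gamma ln r + gamma ln r0."""
    x = [math.log(v) for v in r]
    y = [math.log(v) for v in xi]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    gamma = -slope
    r0 = math.exp((my - slope * mx) / gamma)
    return r0, gamma
```

Because $r_0$ enters the second fit only through the intercept divided by the slope, small changes in $\gamma$ translate into large changes in $r_0$, which is one way to see the covariance between the two parameters mentioned above.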

Many possible effects may help to further alleviate the tension
between model and data. For example, we have seen that the
error budget is dominated by cosmic variance that we have estimated
using mock catalogues extracted from the Millennium Simulation.
Although very large, the computational box is still small
for sufficiently rare events. For example, it is not sufficient to contain
one z = 6 Sloan quasar on average. Moreover, clustering statistics
are more sensitive to simulation volume than most other quantities
one typically considers. Yet, the Millennium Simulation box can
accommodate about 100 independent Chandra fields and thus the
true variance should not be significantly larger than the estimated
one. Alternatively, the analysis of the real data might be affected
by errors that have not been accounted for in the analysis of the
mock samples. For example, the spatial two-point correlation function
of Gilli et al. (2005) has been obtained from the projected one
assuming a power-law model. Possible deviations from the power-law
shape would also contribute to errors. However, according to
our models, these errors should be negligible, since the mock AGN
correlation function is well approximated by a power law. Several
examples can be worked out. However, in order to significantly affect
our results, these hidden errors must be comparable to cosmic
variance which, as we have seen, is larger than the sparse sampling
error.

Uncertainties in model predictions provide an additional way
to reduce the discrepancy between model and data. For example,
the clustering of our mock AGN could be enhanced by forcing
models to preferentially populate highly biased, massive haloes.
This would increase the AGN correlation length in both CDFN
and CDFS and reduce the mismatch between model and data. More
physically motivated AGN models may predict very different properties
for AGN that populate haloes of a given mass. This would
increase the so-called stochasticity of the AGN bias and increase
the size of the coloured regions in Fig. 5 (Dekel & Lahav 1999;
Sigad et al. 2000). However, it is not at all obvious how to achieve
this task.

The other possibility, of course, is that the discrepancy between
CDFN and CDFS is significant and that the observed clustering
of the AGN in the CDFS is unusually large. An indication that
this may indeed be the case is provided by the AGN two-point
correlation function recently measured in the XMM-COSMOS field
by Gilli et al. (2009), which is consistent with that of CDFN and, as
we have verified, with our model predictions, but not with that of
CDFS.

Finally, as can be seen in Fig. 5, we stress that our conclusions
are robust with respect to the lightcurve model assumed. Moreover,
as we have verified, our results are almost unchanged when using
different assumptions in converting AGN bolometric luminosities
into optical apparent magnitudes.



4.2.1 Luminosity dependent AGN clustering 

Plionis et al. (2008) have recently investigated the clustering of the
AGN in the CDFs as a function of their luminosity. The authors
have measured the two-point angular correlation function of the
objects in different flux-limited subsamples and then used Limber's
equation to derive the spatial clustering length $r_0$. They found a
strong dependence of $r_0$ on the median X-ray luminosity of each
flux-limited subsample in both the CDFN and CDFS and in the
soft and hard X-ray band.

To investigate whether we find a similar trend in our model,
we have extracted different flux-limited subsamples from the mock
Chandra fields, characterized by different values of the flux limit
$F_{\rm limit}$ and, therefore, by a different median X-ray luminosity
$\langle L_{\rm AGN,X} \rangle$. The clustering length of the mock AGN in each
subsample has been estimated by fitting their spatial two-point
correlation functions with a power-law. The results are shown in
Figure 6, in which we plot the values of $r_0$ as a function of
$\langle L_{\rm AGN,X} \rangle$ for the AGN in the mock
CDFN (upper panels) and CDFS (bottom panels). The results of our
four lightcurve models are represented by different symbols (model
I: blue triangles, II: red squares, III: green pentagons, best: cyan
hexagons) and compared with the results of Plionis et al. (2008)
(black dots). Model predictions have been obtained by averaging
over 100 different mock catalogues for each lightcurve model.
Errors show the scatter among the mock fields.
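
The construction of the flux-limited subsamples can be sketched as follows in Python. Each subsample keeps the objects brighter than a given flux limit and is labelled by its median X-ray luminosity. The catalogue layout (dictionaries with `flux` and `lum` keys), the function name and the choice of flux limits are hypothetical and serve only to illustrate the procedure.

```python
from statistics import median


def flux_limited_subsamples(agn, flux_limits):
    """For each flux limit, return (limit, median luminosity, subsample).

    `agn` is a list of dicts with keys "flux" (observed X-ray flux) and
    "lum" (X-ray luminosity); empty subsamples are skipped.
    """
    result = []
    for f_lim in flux_limits:
        sub = [a for a in agn if a["flux"] >= f_lim]
        if sub:
            result.append((f_lim, median(a["lum"] for a in sub), sub))
    return result
```

Since the brighter subsamples are nested inside the fainter ones, the corresponding estimates of $r_0$ are not statistically independent, which should be kept in mind when reading a trend of $r_0$ with median luminosity.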

In all models the correlation length is almost constant with
luminosity, showing just a slight increase at high luminosities, in
disagreement with the strong luminosity dependence of $r_0$ found by
Plionis et al. (2008). Although small, the precise trend in the mock
catalogues depends on the lightcurve model adopted. For instance,
in model best the dependence is quite mild, while in model II $r_0$
significantly increases already above $\langle L_{\rm AGN,X} \rangle \sim 10^{42.5}$ erg s$^{-1}$. The
spread in the model predictions makes the clustering luminosity
dependence a possible observational test to discriminate among
different theoretical models if they can be compared with larger samples
in order to reduce the size of the error bars. The sample of
AGN with measured redshift in the 2 deg$^2$ XMM-COSMOS field
represents an important step in this direction. Interestingly enough,
the correlation length of $\sim 500$ AGN with typical X-ray luminosity
of $10^{43.8}$ erg s$^{-1}$ in the 0.5-10 keV band is in the range 6-8
$h^{-1}$ Mpc (depending on whether a prominent structure at z = 0.36 is
included or not in the sample), significantly smaller than the value
estimated by Plionis et al. (2008) and in good agreement with the
one predicted by our models (Gilli et al. 2008).

We note that a global study of the clustering properties of simulated
AGN not restricted to the Chandra fields will be presented
in Bonoli et al. (2008). We anticipate here a similar result for the
luminosity dependence of AGN clustering: $r_0$ is found to be only
weakly dependent on luminosity, in particular in the redshift range
$z \sim 2$-$3$, which corresponds to the peak of the AGN number density.



5 CONCLUSIONS 

In this paper, we modelled the AGN spatial distribution measured
in the Chandra deep fields within the framework of hierarchical
co-evolution of BHs and their host galaxies. For this purpose,
we have applied the semi-analytic techniques developed by
Croton et al. (2006), De Lucia & Blaizot (2007) and M08 to follow
the cosmological evolution of AGN inside the Millennium Simulation
(Springel et al. 2005), and extracted a number of independent
mock catalogues of AGN that closely resemble the CDFS and
CDFN. Each mock CDF catalogue has been obtained by including
all AGN within a past light cone of a generic observer that meet
the same selection criteria (field of view, flux limit, edge effects)
as the real sample. The large volume of the Millennium Run allowed
us to extract hundreds of independent mock CDFs in which
we have measured the spatial two-point correlation function of the
mock AGN in real space. We have ignored redshift space distortions
since these are already corrected for in the observational estimates
of Gilli et al. (2005) and Plionis et al. (2008), with which we wish
to compare.

The main results of this study can be summarized as follows: 
(i) The number counts of bright model AGN agree with observations
both in the soft and in the hard X-ray bands. The abundance
of model AGN at fluxes below $10^{-15}$ erg cm$^{-2}$ s$^{-1}$, however, is
larger than observed. The amplitude of the mismatch depends on
the lightcurve model explored and on the AGN intrinsic absorption.
In fact, our models seem to underpredict the abundance of type-2
objects.

Figure 6. The values of $r_0$ as a function of the median X-ray luminosity
$\langle L_{\rm AGN,X} \rangle$ for the AGN in the mock CDFN (upper panels) and CDFS (bottom
panels). The results of our four lightcurve models are represented by
different symbols (I: blue triangles, II: red squares, III: green pentagons,
best: cyan hexagons) and compared with the results of Plionis et al. (2008)
(black dots). Model predictions have been obtained by averaging over 100
different mock catalogues for each lightcurve model. Errors show the scatter
among the mock fields.



(ii) The number of mock AGN in the simulated CDFs in the 
redshift range 1.5 < z < 4 is higher than observed in the soft X-ray 
band. The mismatch is less evident in the hard X-ray band. This 
discrepancy in the redshift distributions is not unexpected since, 
as discussed by M08, the same hybrid model considered in this 
work over-predicts the abundance of faint objects with redshift in 
the range z < 4 (see their Fig. 7).

(iii) The spatial two-point correlation function predicted by
all lightcurve models is well described by a power-law out to 20
$h^{-1}$ Mpc. If one sets the slope $\gamma = 1.4$, as in Gilli et al. (2005),
then the correlation length $r_0$ agrees, to within $1\sigma$, with that
measured by Gilli et al. (2005) in the CDFN once cosmic variance is
accounted for. On the contrary, the mock AGN in the CDFS are
much less correlated than the real ones. In this case, the discrepancy
in the correlation length is of the order of 2-2.5$\sigma$, depending on the
lightcurve model adopted.

(iv) The mismatch is alleviated by performing a two-parameter
fit to the two-point correlation function. However, a quantitative
estimate is hampered by the lack of robustness in the two-parameter
fitting procedure which results from low number statistics. The tension
between model and data is further alleviated by possible
observational errors that are not properly accounted for and by model
uncertainties. Overall, one expects that the discrepancy between the
observed and modeled $\xi(r)$ is smaller than the 2-2.5$\sigma$ mismatch in
the correlation lengths quoted previously.

(v) The agreement between the correlation functions in the
XMM-COSMOS field (Gilli et al. 2009) and in the CDFN, which, as we
have shown, is well reproduced by our AGN models, suggests that
the AGN clustering in the CDFS is indeed unusually high.

(vi) The models predict that the clustering amplitude depends
little on the luminosity of AGN, in disagreement with the strong
dependence found by Plionis et al. (2008) but in agreement with the
measurements of the clustering of luminous AGN in the recently
compiled XMM-COSMOS catalogue (Gilli et al. 2009).

Precise predictions for the luminosity dependence of the AGN
clustering depend on the adopted theoretical models, and their
present mutual agreement merely reflects the still large field-to-field
variance. Therefore, one can hope that measuring the AGN
clustering properties as a function of their luminosity in larger
datasets could help discriminate among the models. Furthermore,
going beyond the spatial AGN autocorrelation function, the analysis
of the cross-correlation between AGN and galaxies in the next
generation all-sky surveys at $z \geq 1$, like EUCLID or ADEPT, will
place strong constraints on modern semi-analytic models, thereby
shedding light on the complicated mechanisms that regulate the
co-evolution of AGN and galaxies.

APPENDIX A: FROM AGN INTRINSIC LUMINOSITIES
TO OBSERVED MAGNITUDES

To convert the intrinsic bolometric luminosities of the model AGN,
$L_{\rm bol}$, into absorbed apparent R-band magnitudes, given the AGN
redshift and $N_{\rm H}$, we take the following steps. First, we use the
bolometric correction given by Hopkins et al. (2006) to get the
AGN intrinsic B-band luminosity, $L_B$. Then, we get the monochromatic
unabsorbed R-band luminosity, assuming:

$$L_\nu^{\rm UNABS} = L_{B,\nu} \left( \frac{\nu}{\nu_B} \right)^{-0.44}, \qquad {\rm (A1)}$$

where

$$L_{B,\nu} = \frac{\lambda_B^2}{c} L_{B,\lambda} \simeq \frac{\lambda_B^2}{c} \frac{L_B}{\Delta\lambda_B},$$

$\nu_B = c/(445\,{\rm nm})$, $\nu = (1+z)\nu_R = (1+z)c/(658\,{\rm nm})$, $\Delta\lambda_B \sim 100$ nm
and c is the speed of light. The absorbed monochromatic luminosity
can be obtained with the following equation:

$$L_\nu^{\rm ABS} = L_\nu^{\rm UNABS} \times 10^{-0.4 A_\lambda}, \qquad {\rm (A2)}$$

where

$$A_\lambda = A_V + [\ldots]\,(0.000843x^5 - 0.02496x^4 + 0.2919x^3 - 1.815x^2 + 6.83x - 7.92), \qquad {\rm (A3)}$$

$x = \lambda^{-1}$ in $\mu{\rm m}^{-1}$ and $A_V = 5 \times 10^{-22} N_{\rm H}$ (Gaskell & Benker 2007).

Finally, to get the apparent R magnitude in the observer frame,
we use:

$$m_{\rm AB} = 8.9 - 2.5 \log(f_\nu/{\rm Jy}), \qquad {\rm (A4)}$$

where $f_\nu$, the monochromatic flux expressed in units of Jansky, is:

$$f_\nu = (1+z) \frac{L_\nu^{\rm ABS}}{4\pi d_L^2}, \qquad {\rm (A5)}$$

and $d_L(z)$ is the luminosity distance.

To get the total R-band magnitudes of mock objects, the AGN
magnitude computed as described above is finally combined with
that of the host galaxies obtained by De Lucia & Blaizot (2007).
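
The last steps of the conversion can be sketched in Python as follows: the attenuation of the monochromatic luminosity by $A_\lambda$ magnitudes, the AB magnitude from the monochromatic flux, and the combination of the AGN and host magnitudes by adding their fluxes. This is only an illustration under stated assumptions: cgs units are assumed throughout, the luminosity distance and the extinction $A_\lambda$ are taken as inputs rather than computed, and all function names are hypothetical.

```python
import math

JY_CGS = 1.0e-23  # 1 Jansky in erg s^-1 cm^-2 Hz^-1


def absorbed_luminosity(l_nu_unabs, a_lambda):
    """Attenuate a monochromatic luminosity by a_lambda magnitudes."""
    return l_nu_unabs * 10.0 ** (-0.4 * a_lambda)


def ab_magnitude(l_nu_abs, z, d_l_cm):
    """Apparent AB magnitude from a monochromatic luminosity (erg s^-1 Hz^-1),
    a redshift and a luminosity distance in cm (supplied by the caller)."""
    f_nu = (1.0 + z) * l_nu_abs / (4.0 * math.pi * d_l_cm ** 2)
    return 8.9 - 2.5 * math.log10(f_nu / JY_CGS)


def total_magnitude(m_agn, m_host):
    """Combine AGN and host magnitudes by summing the corresponding fluxes."""
    flux = 10.0 ** (-0.4 * m_agn) + 10.0 ** (-0.4 * m_host)
    return -2.5 * math.log10(flux)
```

For two components of equal magnitude the total is brighter by $2.5 \log 2 \simeq 0.75$ mag, so a faint AGN in a bright host can pass the R-band cut even when the AGN alone would not.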



